368 research outputs found

The molecular dimension of microbial species: 1. Ecological distinctions among, and homogeneity within, putative ecotypes of Synechococcus inhabiting the cyanobacterial mat of Mushroom Spring, Yellowstone National Park

Author: Becraft ED
Bryant DA
Cohan FM
Jensen SI
Kühl M
Roberts DW
Rusch DB
Ward DM
Wood JM
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2015

© 2015 Becraft, Wood, Rusch, Kühl, Jensen, Bryant, Roberts, Cohan and Ward. Based on the Stable Ecotype Model, evolution leads to the divergence of ecologically distinct populations (e.g., with different niches and/or behaviors) of ecologically interchangeable membership. In this study, pyrosequencing was used to provide deep sequence coverage of Synechococcus psaA genes and transcripts over a large number of habitat types in the Mushroom Spring microbial mat. Putative ecological species (putative ecotypes), which were predicted by an evolutionary simulation based on the Stable Ecotype Model (Ecotype Simulation), exhibited distinct distributions relative to temperature-defined positions in the effluent channel and vertical position in the upper 1 mm-thick mat layer. Importantly, in most cases variants predicted to belong to the same putative ecotype formed unique clusters relative to temperature and depth in the mat in canonical correspondence analysis, supporting the hypothesis that while the putative ecotypes are ecologically distinct, the members of each ecotype are ecologically homogeneous. Putative ecotypes responded differently to experimental perturbations of temperature and light, but the genetic variation within each putative ecotype was maintained as the relative abundances of putative ecotypes changed, further indicating that each population responded as a set of ecologically interchangeable individuals. Compared to putative ecotypes that predominate deeper within the mat photic zone, the timing of transcript abundances for selected genes differed for putative ecotypes that predominate in microenvironments closer to the upper surface of the mat, with spatiotemporal differences in light and O2 concentration.
All of these findings are consistent with the hypotheses that Synechococcus species in hot spring mats are sets of ecologically interchangeable individuals that are differently adapted, that these adaptations control their distributions, and that the resulting distributions constrain the activities of the species in space and time.

OPUS - University of Technology Sydney

A Catalog of Reference Genomes from the Human Microbiome

Author: Creasy HH
Group of authors
Highlander SK
Mitreva Makedonka
Nelson Karen
Rusch DB
Weinstock George M
Worley KC
Wortman JR
Publication date: 21/05/2010

The human microbiome refers to the community of microorganisms, including prokaryotes, viruses and microbial eukaryotes, that populate the human body. The National Institutes of Health launched an initiative that focuses on describing the diversity of microbial species associated with health and disease. The first phase of this initiative includes the sequencing of hundreds of microbial reference genomes, coupled to metagenomic sequencing from multiple body sites. Here we present results from an initial reference genome sequencing of 178 microbial genomes. From 547,968 predicted polypeptides that correspond to the gene complement of these strains, “novel” polypeptides that had both unmasked sequence length > 100 amino acids and no BLASTP match to any non-reference entry in the nr subset were defined. This analysis resulted in a set of 30,867 polypeptides, of which 29,987 (~97%) were unique. In addition, this set of microbial genomes allows for ~40% of random sequences from the microbiome of the gastrointestinal tract to be associated with organisms based on the match criteria used. Insights into pan-genome analysis suggest that we are still far from saturating microbial species genetic datasets. In addition, the associated metrics and standards used by the group for quality assurance are presented.

UGD Academic Repository

The importance of metagenomic surveys to microbial ecology: or why Darwin would have been a metagenomic scientist

Author: DB Rusch
EO Wilson
J Whitfield
JA Gilbert
RA Edwards
RK O'Dor
Publication venue: BioMed Central
Publication date: 01/01/2011

Scientific discovery is incremental. The Merriam-Webster definition of 'Scientific Method' is "principles and procedures for the systematic pursuit of knowledge involving the recognition and formulation of a problem, the collection of data through observation and experiment, and the formulation and testing of hypotheses". Scientists are taught to be excellent observers, as observations create questions, which in turn generate hypotheses. After centuries of science we tend to assume that we have enough observations to drive science, and enable the small steps and giant leaps which lead to theories and subsequent testable hypotheses. One excellent example of this is Charles Darwin's Voyage of the Beagle, which was essentially an opportunistic survey of biodiversity. Today, obtaining funding for even small-scale surveys of life on Earth is difficult, but few dispute the importance of the theory that Darwin generated from the observations he made during this epic journey. However, these observations, even combined with the parallel work of Alfred Russel Wallace at around the same time, have still not generated an indisputable 'law of biology'. The fact that evolution remains a 'theory', at least to the general public, suggests that surveys for new data need to be taken to a new level.

Crossref

Springer - Publisher Connector

PubMed Central

Robust estimation of microbial diversity in theory and in practice

Author: A Chao
A Gobet
A Ives
AE Magurran
AK Shaw
AO Kislyuk
B Haegeman
Bart Haegeman
BJM Bohannan
C Mora
C Pedrós-Alió
C Quince
CA Lozupone
CE Shannon
CX Mao
DB Rusch
DE Dykhuizen
E Stackebrandt
EH Simpson
EO Wilson
H Tettelin
J Bunge
J Gans
JA Huber
JB Hughes
JL Green
John Moriarty
Jonathan Dushoff
Joshua S Weitz
Jérôme Hamelin
L Jost
L Øvreås
LFW Roesch
M Loreau
MC Horner-Devine
ML Sogin
MO Hill
N Fromin
NJ Gotelli
P Kemp
P Loisel
PD Schloss
Peter Neal
R Lande
RK Colwell
RM May
S Engen
SH Hong
SJ Bent
TJ Shen
TP Curtis
U Brose
V Torsvik
WB Whitman
WT Sloan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/02/2013

Quantifying diversity is of central importance for the study of structure, function and evolution of microbial communities. The estimation of microbial diversity has received renewed attention with the advent of large-scale metagenomic studies. Here, we consider what the diversity observed in a sample tells us about the diversity of the community being sampled. First, we argue that one cannot reliably estimate the absolute and relative number of microbial species present in a community without making unsupported assumptions about species abundance distributions. The reason for this is that sample data do not contain information about the number of rare species in the tail of species abundance distributions. We illustrate the difficulty in comparing species richness estimates by applying Chao's estimator of species richness to a set of in silico communities: they are ranked incorrectly in the presence of large numbers of rare species. Next, we extend our analysis to a general family of diversity metrics ("Hill diversities"), and construct lower and upper estimates of diversity values consistent with the sample data. The theory generalizes Chao's estimator, which we retrieve as the lower estimate of species richness. We show that Shannon and Simpson diversity can be robustly estimated for the in silico communities. We analyze nine metagenomic data sets from a wide range of environments, and show that our findings are relevant for empirically sampled communities. Hence, we recommend the use of Shannon and Simpson diversity rather than species richness in efforts to quantify and compare microbial diversity.
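
The quantities named in this abstract can be illustrated with a short sketch. It computes plug-in Hill diversities of order q (q = 0 is observed richness, q = 1 is the exponential of Shannon entropy, q = 2 is the inverse Simpson index) and the classic Chao1 richness estimate from a list of per-species counts. The function names and the example counts are illustrative assumptions, and the code does not reproduce the lower and upper diversity estimates constructed in the paper.

```python
import math


def hill_diversity(counts, q):
    """Plug-in Hill number of order q from raw per-species counts."""
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


def chao1(counts):
    """Classic Chao1 estimate: S_obs + f1^2 / (2 * f2).

    f1 and f2 are the numbers of species seen exactly once and twice.
    When f2 is zero, the bias-corrected form f1 * (f1 - 1) / 2 is used.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)


if __name__ == "__main__":
    sample = [97, 1, 1, 1]  # hypothetical counts: one dominant species, three singletons
    for q in (0, 1, 2):
        print(q, hill_diversity(sample, q))
    print("Chao1:", chao1(sample))
```

In this toy sample the richness-type values depend entirely on the three singletons, while the q = 1 and q = 2 values are dominated by the abundant species, which is in line with the abstract's point that rare species drive the uncertainty in richness estimates.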

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

PubMed Central

HAL-INSU

The University of Manchester - Institutional Repository

Lancaster E-Prints

The significance of nitrogen cost minimization in proteomes of marine microorganisms

Author: A Dufresne
Alex M Dussaq
D Bordo
DB Rusch
DG Capone
EE Snyder
F Partensky
FM Lauro
G Rocap
H Akashi
H Seligmann
I Barrai
J Frank
J Lv
JC Venter
JDJ Gilbert
JF Wu
JG Bragg
JH Martin
Joseph J Grzymski
K Gundersen
K Liolios
L Campbell
L Lindahl
M Simon
MJ Behrenfeld
MJ Dufton
MR Mulholland
N Price
P Baudouin-Cornu
PG Falkowski
PM Vitousek
RA Cox
RM Morris
S Karlin
SJ Giovannoni
SM Sowell
WS Broecker
ZI Johnson
Publication venue: Nature Publishing Group

Marine microorganisms thrive under low levels of nitrogen (N). N cost minimization is a major selective pressure imprinted on open-ocean microorganism genomes. Here we show that amino-acid sequences from the open ocean are reduced in N, but increased in average mass compared with coastal-ocean microorganisms. Nutrient limitation exerts significant pressure on organisms supporting the trade-off between N cost minimization and increased average mass of amino acids that is a function of increased A+T codon usage. N cost minimization, especially of highly expressed proteins, reduces the total cellular N budget by 2.7–10%; this minimization, in combination with reduction in genome size and cell size, is an evolutionary adaptation to nutrient limitation. The biogeochemical and evolutionary precedent for these findings suggests that N limitation is a stronger selective force in the ocean than biosynthetic costs and is an important evolutionary strategy in resource-limited ecosystems.

Crossref

PubMed Central

Computational Biology in Costa Rica: The Role of a Small Country in the Global Context of Bioinformatics

Author: A Troyo
AI Nilsson
BG Fry
Bruno Lomonte
CA Mathews
DB Rusch
E Moreno
Edgardo Moreno
F Warnecke
G Neshich
IT Paulsen
J Lamontagne
JJ Calvete
José-María Gutiérrez
L Sanz
LL Rodríguez
M Ohno
Philip E. Bourne
R Palacios
SM Whitfield
Y Angulo
Publication venue: Public Library of Science
Publication date: 01/01/2008

Introduction: The successful development of high throughput methods for DNA sequencing, transcriptomics, proteomics, and other –omics has contributed to the emergence of novel possibilities for the examination of complex biological systems through computational analysis. These fields have witnessed unprecedented advances in high income countries. Nevertheless, the role of other nations needs to be examined in order to delineate their contribution within the global context of bioinformatics. Previous articles have focused on the expansion of Computational Biology in Brazil and Mexico [1],[2], two of the largest Latin American countries, which have shown political commitment to foster their scientific development. Costa Rica is a small Central American country with a population of 4 million, with its territory 164 and 38 times smaller than Brazil and Mexico, respectively. Thus, it is interesting to visualize the possibilities and challenges of this low-income country in the context of the global bioinformatics endeavor.

Repositorio Institucional de la Universidad de Costa Rica

Crossref

Directory of Open Access Journals

PubMed Central

Repositorio Académico de la Universidad Nacional de Costa Rica

Defining seasonal marine microbial community dynamics

Author: AJ Southward
Alice C McHardy
Ben Temperton
BL Maidak
Dawn Field
DB Nedwell
DB Rusch
DL Kirchman
E Pruesse
EK Costello
Ian Joint
J Gregory Caporaso
J Reeder
JA Fuhrman
JA Gilbert
Jack A Gilbert
Jed A Fuhrman
Jens Reeder
JG Caporaso
JJ Cullen
Joshua A Steele
K Lewis
KR Clarke
Lars Steinbrück
MJ Church
ML Sogin
N Fierer
Paul Somerfield
Q Wang
RD Pingree
RM Morris
Rob Knight
SM Huse
Susan Huse
TZ DeSantis
Z Liu
Publication venue: Nature Publishing Group
Publication date: 01/01/2012

Here we describe the longest microbial time-series analyzed to date using high-resolution 16S rRNA tag pyrosequencing of samples taken monthly over 6 years at a temperate marine coastal site off Plymouth, UK. Data treatment affected the estimation of community richness over a 6-year period, whereby 8794 operational taxonomic units (OTUs) were identified using single-linkage preclustering and 21 130 OTUs were identified by denoising the data. The Alphaproteobacteria were the most abundant Class, and the most frequently recorded OTUs were members of the Rickettsiales (SAR 11) and Rhodobacteriales. This near-surface ocean bacterial community showed strong repeatable seasonal patterns, which were defined by winter peaks in diversity across all years. Environmental variables explained far more variation in seasonally predictable bacteria than did data on protists or metazoan biomass. Change in day length alone explains >65% of the variance in community diversity. The results suggested that seasonal changes in environmental variables are more important than trophic interactions. Interestingly, microbial association network analysis showed that correlations in abundance were stronger within bacterial taxa than between bacteria and eukaryotes, or between bacteria and environmental variables.

Crossref

Plymouth Marine Science Electronic Archive (PlyMSEA)

PubMed Central

eScholarship - University of California

MPG.PuRe

NERC Open Research Archive

Short clones or long clones? A simulation study on the use of paired reads in metagenomics

Author: C von Mering
D Benson
D Bentley
D MacLean
Daniel H Huson
DB Rusch
DC Richter
DH Huson
DR Bentley
FW J Kuever
I Korf
J Frias-Lopez
JC Venter
JE Koenig
K Mavromatis
M Ashburner
M Margulies
Max Schubach
R Overbeek
R Seshadri
S Mitra
SF Altschul
SG Tringe
Suparna Mitra
T Urich
T Woyke
V Kunin
VM Markowitz
W Miller
W Qi
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Metagenomics is the study of environmental samples using sequencing. Rapid advances in sequencing technology are fueling a vast increase in the number and scope of metagenomics projects. Most metagenome sequencing projects so far have been based on Sanger or Roche-454 sequencing, as only these technologies provide long enough reads, while Illumina sequencing has not been considered suitable for metagenomic studies due to a short read length of only 35 bp. However, now that reads of length 75 bp can be sequenced in pairs, Illumina sequencing has become a viable option for metagenome studies. Results: This paper addresses the problem of taxonomical analysis of paired reads. We describe a new feature of our metagenome analysis software MEGAN that allows one to process sequencing reads in pairs and makes assignments of such reads based on the combined bit scores of their matches to reference sequences. Using this new software in a simulation study, we investigate the use of Illumina paired-sequencing in taxonomical analysis and compare the performance of single reads, short clones and long clones. In addition, we also compare against simulated Roche-454 sequencing runs. Conclusion: This work shows that paired reads perform better than single reads, as expected, but also, perhaps slightly less obviously, that long clones allow more specific assignments than short ones. A new version of the program MEGAN that explicitly takes paired reads into account is available from our website.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Analysis and comparison of very large metagenomes with fast clustering and functional annotation

Author: AC McHardy
AR Quinlan
B Rodriguez-Brito
D Sheskin
DB Rusch
DC Richter
DH Huson
E Portugaly
EA Dinsdale
EF DeLong
FE Angly
GW Tyson
H Noguchi
H Teeling
J Shendure
JC Venter
K Mavromatis
KJ Hoff
L Krause
PD Schloss
R Seshadri
RK Aziz
S Yooseph
SF Altschul
SG Tringe
SR Eddy
SR Gill
W Li
Weizhong Li
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results: The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion: RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Streaming histogram sketching for rapid microbiome analytics

Author: A Sczyrba
AG Shaw
AL Greninger
AP Carrieri
B Grüning
BD Ondov
C Alcon-Giner
C Kakkanatt
D Yang
DB Rusch
F Pedregosa
G Benoit
G Cormode
H Mulcahy-O’Grady
Human Microbiome Project Consortium
I Koychev
JD Forbes
K Sim
LP Coelho
LR Thompson
M Bawa
MW Libbrecht
Q Zhang
R Bovee
S Ioffe
S Seth
SY Anvar
T Brown
T Haveliwala
VB Dubinkina
W Wu
XC Morgan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2019

Background: The growth in publicly available microbiome data in recent years has yielded an invaluable resource for genomic research, allowing for the design of new studies, augmentation of novel datasets and reanalysis of published works. This vast amount of microbiome data, as well as the widespread proliferation of microbiome research and the looming era of clinical metagenomics, means there is an urgent need to develop analytics that can process huge amounts of data in a short amount of time. To address this need, we propose a new method for the compact representation of microbiome sequencing data using similarity-preserving sketches of streaming k-mer spectra. These sketches allow for dissimilarity estimation, rapid microbiome catalogue searching and classification of microbiome samples in near real time. Results: We apply streaming histogram sketching to microbiome samples as a form of dimensionality reduction, creating a compressed ‘histosketch’ that can efficiently represent microbiome k-mer spectra. Using public microbiome datasets, we show that histosketches can be clustered by sample type using the pairwise Jaccard similarity estimation, consequently allowing for rapid microbiome similarity searches via a locality sensitive hashing indexing scheme. Furthermore, we use a ‘real life’ example to show that histosketches can train machine learning classifiers to accurately label microbiome samples. Specifically, using a collection of 108 novel microbiome samples from a cohort of premature neonates, we trained and tested a random forest classifier that could accurately predict whether the neonate had received antibiotic treatment (97% accuracy, 96% precision) and could subsequently be used to classify microbiome data streams in less than 3 s. Conclusions: Our method offers a new approach to rapidly process microbiome data streams, allowing samples to be rapidly clustered, indexed and classified.
We also provide our implementation, Histosketching Using Little K-mers (HULK), which can histosketch a typical 2 GB microbiome in 50 s on a standard laptop using four cores, with the sketch occupying 3000 bytes of disk space.
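
To make the idea of comparing k-mer spectra concrete, the sketch below counts k-mers in two sequences and computes a weighted Jaccard similarity between the resulting histograms. This is an exact, unsketched computation and is not HULK's algorithm: the abstract only says "Jaccard similarity estimation", so the weighted form, the function names and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmer_spectrum(seq, k):
    """Count every overlapping k-mer in a sequence (exact, no sketching)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def weighted_jaccard(a, b):
    """Weighted Jaccard similarity: sum of minima over sum of maxima."""
    keys = set(a) | set(b)
    num = sum(min(a[x], b[x]) for x in keys)  # Counter returns 0 for missing k-mers
    den = sum(max(a[x], b[x]) for x in keys)
    return num / den if den else 1.0


if __name__ == "__main__":
    s1 = kmer_spectrum("ACGTACGT", 4)  # toy sequences, not real microbiome reads
    s2 = kmer_spectrum("ACGTTTTT", 4)
    print(weighted_jaccard(s1, s1))
    print(weighted_jaccard(s1, s2))
```

A sketching method replaces the full histograms with small fixed-size summaries from which a similarity of this kind can be estimated, which is what makes the 3000-byte representation reported above practical for searching large catalogues.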

University of Liverpool Repository

Crossref

University of Birmingham Research Portal

Directory of Open Access Journals

Spiral - Imperial College Digital Repository

University of East Anglia digital repository